When does the top homology of a random simplicial complex vanish?

L. Aronshtam and N. Linial

February 27, 2013 






Abstract 



Several years ago Linial and Meshulam [7] introduced a model called
$X_d(n,p)$ of random $n$-vertex $d$-dimensional simplicial complexes.
The following question suggests itself very naturally: What is the
threshold probability $p = p(n)$ at which the $d$-dimensional homology
of such a random $d$-complex is, almost surely, nonzero? Here we derive
an upper bound on this threshold. Computer experiments that we have
conducted suggest that this bound may coincide with the actual
threshold, but this remains an open question.

1 Introduction

We study random simplicial complexes in the $X_d(n,p)$ model, which was
introduced in [7] and further studied in [9, 6, 3, 2, 1]. We quickly recall
the basic features of this model. A simplicial complex $X \sim X_d(n,p)$ has $n$
vertices and a full $(d-1)$-dimensional skeleton. Each $d$-dimensional face is
included in $X$ independently with probability $1 > p > 0$. The
parameter $p$ may (and actually will usually) depend on $n$. We fix once
and for all an arbitrary field $\mathbb{F}$ and let $H_i(X) = H_i(X;\mathbb{F})$ denote the $i$-th
homology group of $X$ with coefficients in $\mathbb{F}$, and $h_i(X) = \dim H_i(X)$. The
main result of this paper is an upper bound on the threshold for the almost
sure nonvanishing of $H_d(X)$ for $X \sim X_d(n,p)$. This question was addressed
in [1] and the present paper is, in many ways, a continuation of that paper.

To put this result in perspective, it may be useful to recall the situation for
$d = 1$, in which case $X_d(n,p)$ coincides with the Erdős-Rényi $G(n,p)$ model of
random graphs. For every $1 > c > 0$, a random graph from $G(n, \frac{c}{n})$ contains
a cycle with some probability $1 > f(c) > 0$. Therefore, the threshold for the
appearance of a 1-homology in $G(n, \frac{c}{n})$ is coarse. Namely, the probability
that the random graph from $G(n, \frac{c}{n})$ contains a cycle jumps from a positive
number to 1 as $c$ changes from $1 - \epsilon$ to 1. The behavior for dimension $d > 1$ is
similar. This is due to the fact that $\partial\Delta_{d+1}$, the boundary of a $(d+1)$-simplex,
occurs with positive probability in $X_d(n, \frac{c}{n})$ for every $c > 0$.

However, the situations in one dimension and in higher dimensions are
quite different when we consider the size of the first occurring cycle. This
issue is a little easier to discuss in the language of a random graph (or
complex) process, where edges (resp. $d$-faces) are added at random one at a time.
For graphs, the length of the first generated cycle is distributed according to
a certain known distribution [4]. In contrast, in the random $d$-dimensional
complex process, it follows from [1] that the first emerging cycle in $H_d(X)$ is
either $\partial\Delta_{d+1}$, or it has cardinality $\Omega(n^d)$. We have conducted fairly
extensive computer experiments with $d = 2$. In these experiments, whenever the
latter case occurred, the first emerging cycle included all $n$
vertices. Whether this can be proved and whether the same holds for general
$d$ remains a subject for further study.

The degree of a $(d-1)$-face in a $d$-dimensional complex is the number
of $d$-faces that contain it. A $(d-1)$-face of degree zero, i.e. one that is
contained in no $d$-face, is said to be isolated. A $(d-1)$-face of degree 1 is
said to be free. The removal of a free $(d-1)$-face and the unique $d$-face that
contains it is called an elementary collapse. We recall [5] that an elementary
collapse is a homotopy equivalence. Given a complex $X$, we carry out a
series of elementary collapses that take place in phases. At the beginning
of a phase we list all $(d-1)$-faces in the complex which are currently free,
and we scan them in an arbitrary order. As we arrive at a $(d-1)$-face $\tau$ in
the list, one of two things can happen. It may still be free, in which case we
apply to it an elementary collapse. It is also possible that when $\tau$ is reached,
it is already isolated, since the unique $d$-face that initially contained it was
already eliminated in a previous elementary collapse. In this case we simply
skip $\tau$. When we reach the end of the list, the current phase terminates and
a new phase commences.

We denote by $R_i(X)$ the complex obtained from $X$ at the end of phase $i$. In
particular $R_0(X)$ is just the randomly drawn complex with which we start.
A $d$-face from $R_{i-1}(X) \setminus R_i(X)$ is said to be in generation $i$. A $(d-1)$-face
is of generation $i$ if its degree in $R_{i-1}(X)$ is positive and it either does not
belong to $R_i(X)$ or is isolated there.
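The collapsing process in phases can be sketched in a few lines of code. The following is only an illustrative sketch under our own assumptions: a complex is represented by its list of $d$-faces (each a collection of $d+1$ vertices), and the name `collapse_in_phases` and the returned dictionary of generations are hypothetical, not taken from the experiments reported here.

```python
from itertools import combinations

def collapse_in_phases(d_faces, max_phases=None):
    """Collapse free (d-1)-faces phase by phase.

    d_faces: iterable of d-faces, each an iterable of d+1 vertices.
    Returns (remaining, generation): the surviving d-faces, and a dict
    mapping each collapsed d-face to the phase in which it was removed.
    """
    remaining = {frozenset(f) for f in d_faces}
    # cofaces[tau] = set of remaining d-faces that contain the (d-1)-face tau
    cofaces = {}
    for sigma in remaining:
        for tau in combinations(sigma, len(sigma) - 1):
            cofaces.setdefault(frozenset(tau), set()).add(sigma)

    generation = {}
    phase = 0
    while max_phases is None or phase < max_phases:
        # List the (d-1)-faces that are free at the beginning of the phase.
        free = [tau for tau, cof in cofaces.items() if len(cof) == 1]
        if not free:
            break
        phase += 1
        for tau in free:
            if len(cofaces[tau]) != 1:
                continue  # tau became isolated earlier in this phase: skip it
            (sigma,) = cofaces[tau]
            # Elementary collapse: remove sigma together with tau.
            remaining.discard(sigma)
            generation[sigma] = phase
            for rho in combinations(sigma, len(sigma) - 1):
                cofaces[frozenset(rho)].discard(sigma)
    return remaining, generation
```

For instance, a triangle with one further triangle glued to each of its edges collapses completely, the three outer triangles in phase 1 and the central one in phase 2, whereas the boundary of a tetrahedron has no free edge and is left untouched.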

Theorem 1.1. Let $1 > \beta = \beta_d > 0$ be the unique positive root of the equation



$$(d+1)\beta + (d+1-d\beta)\ln(1-\beta) = 0, \tag{1}$$



and let $c_d^*$ be defined as

$$c_d^* = \frac{-\ln(1-\beta)}{\beta^d}. \tag{2}$$

If $c > c_d^*$, then a complex $X$ drawn from $X_d(n, \frac{c}{n})$ satisfies asymptotically
almost surely

$$H_d(X) \neq 0.$$

Specifically, $c_2^* = 2.75381$, $c_3^* = 3.90708$. Also $\beta_2 = 0.883414$ and $\beta_3 =
0.972498$. For large $d$

$$c_d^* = (d+1) - \frac{d+1}{\exp(d+1)} + O\left(\frac{d^3}{\exp(2d)}\right)$$

and $\beta_d = 1 - \exp(-(d+1)) - (1 + o_d(1))\frac{(d+1)^2}{\exp(2(d+1))}$.
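The numerical values quoted in the theorem can be checked with a short computation. The sketch below assumes that Equation (1) is equivalent to $(d+1)\beta + (d+1-d\beta)\ln(1-\beta) = 0$, which is the condition obtained by setting the main term of $E[s]$ to zero at a fixed point $\beta = 1 - \exp(-c\beta^d)$, and that $c_d^* = -\ln(1-\beta)/\beta^d$. The function names `beta_d` and `c_star` and the bisection bracket are illustrative choices, valid for small $d \ge 2$.

```python
import math

def beta_d(d, iterations=200):
    """Bisection for the positive root of (d+1)b + (d+1-d*b) ln(1-b) = 0."""
    def g(b):
        return (d + 1) * b + (d + 1 - d * b) * math.log1p(-b)

    # Assumed bracket: g(0.5) > 0 for d >= 2, and g is negative close to 1
    # as long as d is small (say d <= 20).
    lo, hi = 0.5, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def c_star(d):
    """The bound c_d^* = -ln(1 - beta_d) / beta_d^d."""
    b = beta_d(d)
    return -math.log1p(-b) / b ** d

for d in (2, 3):
    print(d, round(beta_d(d), 6), round(c_star(d), 5))
```

Under these assumptions the computed roots agree with the values of $\beta_2, \beta_3, c_2^*, c_3^*$ stated above to the displayed precision.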

Note: Here are a few words about our experiments for $d = 2$. We
run the random process in which a random 2-face is added to the complex
at each step. The experiment splits according to whether the first cycle
to occur is $\partial\Delta_3$ or not. Conditioned on the first cycle not being $\partial\Delta_3$, the
numerical estimates that we get for $c_2^*$ for $n = 50, 100, 200$ are (expectation ±
standard deviation) $2.70424 \pm 0.03115$, $2.72886 \pm 0.01534$, $2.74149 \pm 0.00733$
respectively. This lends some support to our belief that the bound attained
in Theorem 1.1 is the true value of the threshold probability.

The general strategy of our proof is this: An elementary collapse is a
homotopy equivalence and in particular it preserves the homology of the
complex. Using ideas similar to [1] we observe what happens as we
systematically collapse (in phases, as described above) every free $(d-1)$-face.
An elementary collapse eliminates exactly one $(d-1)$-face and one $d$-face.
However, it also happens that a non-isolated $(d-1)$-face becomes isolated
through collapses on other $(d-1)$-faces. We denote by $\mathcal{C}^1$ the (random)
set of such faces. Also, let $\mathcal{C}^0$ be the set of isolated $(d-1)$-faces in the
original $X$. Finally let $\mathcal{C}^* := \mathcal{C}^0 \cup \mathcal{C}^1$ and let $\zeta_* := |\mathcal{C}^*|$. Observe that if
$f_d(X) > f_{d-1}(X) - \zeta_*$, where $f_i(X)$ denotes the number of $i$-dimensional
faces of $X$, then $H_d(X) \neq 0$. Thus the main parts of the proof
are these:

• Local analysis of the collapsing process. 

• Computing the expectation $E(\zeta_*)$.

• A measure-concentration argument on the random variable $\zeta_*$.

Our proof uses the fact that every $(d-1)$-face in $\mathcal{C}^*$ corresponds to a zero
row in the inclusion matrix of $(d-1)$-faces vs. $d$-faces (after the collapses).
There is another way of establishing the threshold, as done in the upper
bound proof in [1]. It is possible to associate to every $(d-1)$-face in $\mathcal{C}^*$
a cocycle in $Z^{d-1}(X)$ and use the Euler-Poincaré relation to give an upper
bound on the threshold. Indeed our proof can be viewed as an extension of
the argument of [1].

As the reader has probably noticed, the above explanation says nothing
about the parameter $\beta$ which plays a key role in the theorem. This is addressed
in Section 4.1 below, where we provide a more comprehensive overview of
the proof.

2 The Probability Space $\mathcal{T}_d(k, c)$

We analyze the sequence of $d$-complexes which are obtained, starting from
$X$ and repeatedly collapsing, in phases. Our analysis seeks to determine the
way in which a given face $\varphi$ of dimension $(d-1)$ or $d$ gets collapsed. Note
that $\varphi$'s generation is completely determined by its local neighborhood in
$X$. Concretely, if $\varphi$ is of generation $k$, this can be ascertained by observing
$\varphi$'s radius-$(k+1)$ neighborhood. For every fixed $k$ this neighborhood is
almost surely a $d$-tree. We analyze the properties of this neighborhood using
an intermediary: a Galton-Watson-like model of $d$-trees. This model is
relatively easy to comprehend, and yet it provides a good approximation to
the true local behavior of $X_d(n, \frac{c}{n})$ in the vicinity of $\varphi$. This general strategy
has been used numerous times, and in particular in [1].



We turn to provide the necessary definitions. We start with the (recursive)
definition of a $d$-tree. A single $d$-face is a $d$-tree. A $d$-tree on $n + 1$ vertices
is obtained by taking a $d$-tree $T$ on $n$ vertices and adding to it a new $d$-face
$v \cup \tau$ and its $(d-1)$-skeleton. Here $\tau$ is a $(d-1)$-face of $T$, and $v$ is a new
vertex. A rooted $d$-tree is a $d$-tree in which we designate one $(d-1)$-face to
be the root.

Associated with every $d$-complex $Z$ is a graph $G_Z$ whose vertices are the
$d$-faces and the $(d-1)$-faces of $Z$. An edge between a $d$-face and a
$(d-1)$-face stands for inclusion, and two $(d-1)$-faces are adjacent when they are
contained in a $d$-face of $Z$. We freely apply to $Z$ graph-theoretic notions from
$G_Z$ such as distance, diameter and radius from a vertex. Thus the distance
between two $(d-1)$-faces in $Z$ is the distance of the two corresponding
vertices in $G_Z$.

We define a probability space $\mathcal{T}_d(k,c)$ of $d$-trees of radius $\le k$ from a
$(d-1)$-face $\tau$ that is the root of the tree. Thus $\mathcal{T}_d(0,c)$ is just the root
$(d-1)$-face $\tau$. For $k > 0$ we sample a $d$-tree from $\mathcal{T}_d(k,c)$ as follows:

• Sample a $d$-tree $T$ from $\mathcal{T}_d(k-1, c)$.

• For each $(d-1)$-face $\theta$ in $T$ at distance $k-1$ from $\tau$

  - Sample an integer $j$ from the Poisson($c$) distribution.

  - Create $j$ new vertices $t_1, \ldots, t_j$ and add $j$ new $d$-faces $\theta \cup t_i$ to $T$
    for $i = 1, \ldots, j$.
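The sampling procedure above can be sketched as follows. This is a minimal illustration under our own conventions: vertices are integers, faces are frozensets, the root is the face $\{0, \ldots, d-1\}$, and the helper `sample_poisson` (Knuth's multiplication method) is included only because the Python standard library has no Poisson sampler. All names are hypothetical.

```python
import math
import random

def sample_poisson(c, rng):
    """Draw from Poisson(c) by Knuth's method; adequate for moderate c."""
    threshold = math.exp(-c)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def sample_d_tree(d, k, c, rng=None):
    """Sample the d-faces of a rooted d-tree of radius at most k.

    Returns (root, d_faces) where root is the (d-1)-face {0, ..., d-1}.
    """
    rng = rng or random.Random()
    root = frozenset(range(d))
    layer = [root]            # (d-1)-faces at the current distance from the root
    d_faces = []
    next_vertex = d
    for _level in range(k):
        new_layer = []
        for theta in layer:
            for _child in range(sample_poisson(c, rng)):
                t = next_vertex
                next_vertex += 1
                d_faces.append(theta | {t})
                # The d new (d-1)-faces of theta ∪ t are those containing t.
                new_layer.extend((theta - {v}) | {t} for v in theta)
        layer = new_layer
    return root, d_faces
```

Each new $d$-face contributes exactly one new vertex, so a sampled tree with $m$ faces has $d + m$ vertices, which gives a quick sanity check on the output.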

It is useful now to introduce a variation on the notion of the collapsing
process. This is a process which we call $\theta$-collapsing, where $\theta$ is a $(d-1)$-face.
This process is identical to the process of collapsing in phases, except that $\theta$
must not be collapsed, even when it happens to be free. We analyze how a
random $d$-tree $T \in \mathcal{T}_d(k+1, c)$ behaves under the $\tau$-collapsing process where
$\tau$ is the root of $T$. It is obvious that after $k+1$ phases of $\tau$-collapsing, $T$
collapses to $\tau$, but we need to know whether $\tau$ becomes isolated in phase
$k+1$ or sooner. To this end we define the event $C_r(k+1,d,c)$ that $\tau$ belongs
to a generation earlier than $r$, where $r \le k+1$. We denote the probability
of $C_r(k+1,d,c)$ by $\gamma_r(k+1,d,c)$. As mentioned above, whether or not $\tau$
becomes isolated at time $< r$ depends only on its radius-$r$ neighborhood in $T$.
In particular, $\gamma_r(k+1,d,c) = \gamma_r(r,d,c)$ if $k+1 \ge r$. We denote $\gamma_r(r,d,c)$ by
$\gamma_r(d,c)$. Let us calculate the numbers $\gamma_r(d,c)$ for small $r$. Clearly $\gamma_0(d,c) = 0$.
Also, $C_1(k+1,d,c)$ is the event where $T$ consists only of its root $\tau$, so that

$$\gamma_1(d,c) = e^{-c}. \tag{3}$$

Notice that $\tau$, the root of a tree, becomes isolated before the $r$-th phase iff
each $d$-face $\sigma \supset \tau$ satisfies the following condition. There is a $(d-1)$-subface
$\tau' \subset \sigma$, other than $\tau$, that we view as the root of a $d$-tree $T'$ which we $\tau'$-collapse. In the
$\tau'$-collapsing of $T'$, the root $\tau'$ becomes isolated before phase $(r-1)$. Letting
$\pi_j = e^{-c}\frac{c^j}{j!}$ be the probability that a Poisson($c$) random variable takes the value $j$, we
obtain:

$$\gamma_r(d,c) = \sum_{j=0}^{\infty} \pi_j \left(1 - (1 - \gamma_{r-1}(d,c))^d\right)^j = \sum_{j=0}^{\infty} \frac{c^j}{j!} e^{-c} \left(1 - (1 - \gamma_{r-1}(d,c))^d\right)^j = \exp\left(-c\,(1 - \gamma_{r-1}(d,c))^d\right). \tag{4}$$



We denote by $B_k(d,c)$ the event that the root of $T \in \mathcal{T}_d(k+1,c)$ belongs
to a generation later than $k$. The probability of this event is $\beta_k(d,c)$. Clearly

$$\beta_k(d,c) = 1 - \gamma_{k+1}(d,c). \tag{5}$$
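The recursion for $\gamma_r(d,c)$ and the relation between $\beta_k$ and $\gamma_{k+1}$ are easy to evaluate numerically. The sketch below assumes the closed form $\gamma_r = \exp(-c(1-\gamma_{r-1})^d)$ with $\gamma_0 = 0$, and also sums the Poisson series directly as a consistency check. The function names are illustrative.

```python
import math

def gamma_sequence(d, c, r_max):
    """Return [gamma_0, ..., gamma_{r_max}] from the closed-form recursion."""
    gammas = [0.0]  # gamma_0 = 0
    for _ in range(r_max):
        gammas.append(math.exp(-c * (1.0 - gammas[-1]) ** d))
    return gammas

def gamma_next_by_series(d, c, gamma_prev, terms=200):
    """Same step, computed as sum_j pi_j * (1 - (1 - gamma_prev)^d)^j."""
    q = 1.0 - (1.0 - gamma_prev) ** d
    pi_j, q_power, total = math.exp(-c), 1.0, 0.0
    for j in range(terms):
        total += pi_j * q_power
        pi_j *= c / (j + 1)   # Poisson weight for j + 1
        q_power *= q
    return total

def beta_k(d, c, k):
    """beta_k = 1 - gamma_{k+1}."""
    return 1.0 - gamma_sequence(d, c, k + 1)[k + 1]
```

In particular `gamma_sequence(d, c, 1)[1]` equals $e^{-c}$, in agreement with the value of $\gamma_1(d,c)$ computed above.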

3 The Neighborhood of a $(d-1)$-face

The next step is rather standard in arguments of the sort we are making.
Most of the necessary details are to be found in [1], and we now provide
a few additional comments and explanations. The purpose is to show that
the Poisson-distribution-based tree considered above approximates
arbitrarily closely (as $n \to \infty$) the actual local behavior of our random complex.

What does the neighborhood of a $(d-1)$-face $\tau$ in a $d$-dimensional complex
$X$ look like? The 0-neighborhood $S_0$ consists of $\tau$ alone. The $i$-th
neighborhood $S_i$ is the complex generated by the $d$-faces in $S_{i-1}$, and the additional
$d$-faces that contain a $(d-1)$-face in $S_{i-1}$. We denote by $v_i$ the number of
vertices in $S_i$.

Let $A_k$ be the event (in $X_d(n,p)$) that $S_k$ is a $d$-tree. Let $D$ be the event
that every $(d-1)$-face in $X \sim X_d(n,p)$ has degree $\le \log n$.

The argument in [1] has two parts. One shows first

Claim 3.1. Let $k$ and $c > 0$ be fixed and $p = \frac{c}{n}$. Then

$$\Pr[A_k \cap D] = 1 - o(1).$$

The next step is to show that conditioned on the event $A_k \cap D$ the following
recursive random process generates the typical $k$-neighborhood in $X$. As
before $S_0 = \tau$ and for $i > 0$, with $S_{i-1}$ already in place, the next layer $S_i$ is
generated according to the following rule: For each $(d-1)$-face $\theta$ in $S_{i-1}$ at
distance $i-1$ from $\tau$

• Sample an integer $j$ from the binomial distribution $B(n - v_{i-1}, \frac{c}{n})$.

• Create $j$ new vertices $t_1, \ldots, t_j$ and add $j$ new $d$-faces $\theta \cup t_\ell$ to $S_i$ for
$\ell = 1, \ldots, j$.

The only difference between this random process and the way we defined
$\mathcal{T}_d(k,c)$ is that we sample the integer $j$ from $B(n - v_{i-1}, \frac{c}{n})$ and not from
Poisson($c$). Notice that if $X \in A_k \cap D$ then $v_k = O(\log^k n)$ and the total
variation distance between $S_k$ and $\mathcal{T}_d(k,c)$ is $o(1)$.
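To get a feeling for how close the two offspring distributions are, one can compute the total variation distance between $B(n - v, \frac{c}{n})$ and Poisson($c$) directly. The following sketch truncates the sum after a fixed number of terms (an assumption that is harmless for moderate $c$); the function name and the truncation parameter are illustrative.

```python
import math

def tv_binomial_poisson(n, v, c, terms=200):
    """Truncated total variation distance between B(n - v, c/n) and Poisson(c)."""
    trials, p = n - v, c / n
    binom = math.exp(trials * math.log1p(-p))   # P[Binomial = 0]
    poisson = math.exp(-c)                      # P[Poisson = 0]
    total = 0.0
    for j in range(terms):
        total += abs(binom - poisson)
        # Advance both probability mass functions from j to j + 1.
        binom *= (trials - j) / (j + 1) * p / (1.0 - p)
        poisson *= c / (j + 1)
    return 0.5 * total

for n in (100, 10_000, 1_000_000):
    print(n, tv_binomial_poisson(n, 0, 2.0))
```

The distance for a single draw shrinks as $n$ grows with $c$ fixed; the statement about $S_k$ and $\mathcal{T}_d(k,c)$ combines this with the bound on $v_k$.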

Thus for a $(d-1)$-face $\tau$

• $\Pr(\deg_{R_k(X)}(\tau) > 0) = (1 - o(1))\beta_k(d,c)$,

• $\Pr(\deg_{R_{k-1}(X)}(\tau) = 0) = (1 - o(1))\gamma_k(d,c)$.

Consider an inclusion $\tau \subset \sigma$ of a $(d-1)$-face and a $d$-face. Let $S'_k$ be the
$k$-th neighborhood of $\tau$ in $X \setminus \sigma$. We can apply Claim 3.1 to $S'_k$ and conclude
that with probability $1 - o(1)$ it is a $d$-tree in which every $(d-1)$-face has
degree at most $\log n$. To randomly generate $S'_k$ we just run the random
process that generates $S_k$ and modify it, by excluding $\sigma$ from $S_1$. Thus

• $\Pr(\deg_{R_k(X) \setminus \sigma}(\tau) > 0) = (1 - o(1))\beta_k(d,c)$,
• $\Pr(\deg_{R_{k-1}(X) \setminus \sigma}(\tau) = 0) = (1 - o(1))\gamma_k(d,c)$.

4 An Upper Bound For The Threshold 

As usual, we associate a boundary operator with the $d$-dimensional complex
$X$. This linear operator corresponds to an $f_{d-1}(X) \times f_d(X)$ matrix $M$ whose
rows and columns are indexed by $X$'s $(d-1)$-faces resp. $d$-faces. All entries of $M$
are in $\{-1, 0, 1\}$ and are defined as follows. Every $d$-face $\sigma \in X$ is given an
orientation $[v_0, \ldots, v_d]$, and $M_{\sigma \setminus v_i, \sigma} := (-1)^i$ for every $0 \le i \le d$. All other
entries of $M$ equal 0. Since $X$ is $d$-dimensional, $h_d(X) = 0$ iff $M$'s right
kernel is zero.
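As an illustration, the matrix $M$ and the dimension of its right kernel can be computed for small complexes. The sketch below makes several assumptions of our own: vertices are ordered by their natural order to fix orientations, only the $(d-1)$-faces contained in some $d$-face get a row (isolated faces would only add zero rows and do not change the rank), and the field is taken to be $\mathbb{F}_p$ for a prime $p$. All function names are hypothetical.

```python
def boundary_matrix(d_faces):
    """Rows: (d-1)-faces lying in some d-face; columns: d-faces."""
    faces = [tuple(sorted(f)) for f in d_faces]
    rows = sorted({f[:i] + f[i + 1:] for f in faces for i in range(len(f))})
    row_index = {tau: r for r, tau in enumerate(rows)}
    M = [[0] * len(faces) for _ in rows]
    for col, sigma in enumerate(faces):
        for i in range(len(sigma)):
            # Entry for the face obtained by deleting the i-th vertex.
            M[row_index[sigma[:i] + sigma[i + 1:]]][col] = (-1) ** i
    return rows, faces, M

def rank_mod_p(M, p):
    """Rank of an integer matrix over the field with p elements (p prime)."""
    A = [[x % p for x in row] for row in M]
    rank = 0
    n_cols = len(A[0]) if A else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(A)) if A[r][col]), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        inv = pow(A[rank][col], -1, p)          # modular inverse (Python 3.8+)
        A[rank] = [(x * inv) % p for x in A[rank]]
        for r in range(len(A)):
            if r != rank and A[r][col]:
                factor = A[r][col]
                A[r] = [(x - factor * y) % p for x, y in zip(A[r], A[rank])]
        rank += 1
    return rank

def top_betti_number(d_faces, p=2):
    """h_d = (number of d-faces) - rank(M), i.e. dim of the right kernel."""
    _, faces, M = boundary_matrix(d_faces)
    return len(faces) - rank_mod_p(M, p)
```

For example, the four triangles on the vertex set $\{0,1,2,3\}$ form $\partial\Delta_3$ and give $h_2 = 1$, while a single triangle gives $h_2 = 0$.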

Let $N$ be an $a \times b$ matrix and let $\zeta = \zeta(N)$ be the number of zero
rows in $N$. Clearly $N$ has a nonzero right kernel if $b > a - \zeta$, since its rank
is at most $a - \zeta$. We apply
this simple observation to $M_i$, the matrix associated with $R_i(X)$. Since an
elementary collapse is a homotopy equivalence, $H_d(R_i(X)) = H_d(X)$ for all
$i$. We conclude that if $s_i(X) = f_d(R_i(X)) - f_{d-1}(R_i(X)) + \zeta(M_i) > 0$ for
some $i \ge 0$, then $H_d(X) \neq 0$.

Our proof shows that if $c > c_d^*$ then $X \sim X_d(n, \frac{c}{n})$ satisfies a.s. $s(X) =
s_{k_*}(X) > 0$. Here $k_*$ is a large enough constant to be determined later. First
we calculate the expectation of $s(X)$, and then show that a.s. $s(X) > 0$.

Theorem 4.1. Let $p = \frac{c}{n}$ with $c > c_d^*$. Then

$$\Pr[X \in X_d(n,p) : s(X) > 0] = 1 - o_n(1).$$

As the previous discussion shows, this theorem implies Theorem 1.1. 

Proof: Fix $c$ and $d$, let $\beta_k, \gamma_k$ stand for $\beta_k(d,c), \gamma_k(d,c)$ resp., and let
us fix an arbitrarily small $\epsilon > 0$. Let $\zeta_*(X) = \zeta(M_{k_*}(X))$.

A $d$-face $\sigma$ is in $R_{k_*}(X)$ if $\sigma \in X$ and it is not collapsed in phase $k_*$ or
earlier. In particular, at the end of phase $k_* - 1$, every $(d-1)$-subface of $\sigma$
must be contained as well in a $d$-face other than $\sigma$. Hence

$$E[f_d(R_{k_*}(X))] = (1 - o(1))\binom{n}{d+1}\frac{c}{n}\beta_{k_*-1}^{d+1}. \tag{6}$$

Let $\tau$ be a $(d-1)$-face. Let $\mathcal{Q}_\tau$ be the event that $\tau$ becomes isolated
after $k_*$ $\tau$-collapsing phases. The discussion in Sections 2 and 3 yields that
$\Pr(\mathcal{Q}_\tau) = (1 - o(1))\gamma_{k_*+1}$. Let $\mathcal{P}_\tau$ be the event that $\tau$ collapses after $k_*$
collapsing phases, but does not become isolated after $k_*$ $\tau$-collapsing phases.
Equivalently, $\tau$ becomes a free subface of some $d$-face $\sigma$ before collapsing
phase $k_*$, but all other $(d-1)$-subfaces of $\sigma$ are not free prior to collapsing
phase $k_*$. Consequently, $\Pr(\mathcal{P}_\tau) = (1 - o(1))\frac{c}{n}\,n\,\gamma_{k_*}\beta_{k_*-1}^d$.

The row corresponding to a $(d-1)$-face $\tau$ becomes a zero row or is
removed from the matrix after the collapsing phases if $\tau$ is either isolated or
was collapsed. Hence this happens only if the event $\mathcal{Q}_\tau \cup \mathcal{P}_\tau$ occurs. Thus the
probability that a given $(d-1)$-face belongs to the complex and is not isolated
after $k_*$ collapsing phases is $(1 - o(1))(1 - (\gamma_{k_*+1} + c \cdot \gamma_{k_*}\beta_{k_*-1}^d))$. Therefore

$$E[f_{d-1}(R_{k_*}(X)) - \zeta_*(X)] = (1 - o(1))\binom{n}{d}\left(1 - (\gamma_{k_*+1} + c \cdot \gamma_{k_*}\beta_{k_*-1}^d)\right). \tag{7}$$
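Consequently,

$$E[s] = E[f_d(R_{k_*}(X))] - E[f_{d-1}(R_{k_*}(X)) - \zeta_*(X)] = (1 - o(1))\binom{n}{d}\left(-\beta_{k_*} + c\beta_{k_*-1}^d(1 - \beta_{k_*-1}) + \frac{c}{d+1}\beta_{k_*-1}^{d+1}\right). \tag{8}$$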



4.1 Overview

We are now ready to provide a more detailed explanation of our strategy of
proof for Theorem 1.1. We fix an integer $d \ge 2$ once and for all, and some
$c > 0$ whose value we discuss below. With this fixed $c$, we obtain a recurrence
relation for $\beta_k$. (This follows readily from Equation (4).) Namely, starting
with $t = 1$ we recurse on $t \to 1 - f(t)$, where $f_c(t) = f(t) = \exp(-c \cdot t^d)$.
It is easily verified that for every $c > 0$ the function $1 - f(\cdot)$ is increasing in
$[0, 1]$. As already observed in [1] there is some $c_{\mathrm{collapse}} > 0$ depending only
on $d$ so that when $c_{\mathrm{collapse}} > c > 0$, the only root of $1 - f_c(t) = t$ in $[0, 1]$ is
$t = 0$. Therefore, for $c_{\mathrm{collapse}} > c > 0$ the recurrence $t \to 1 - f(t)$ started
at $t = 1$ converges to zero.



Figure 1: The recurrence relation for $\beta_k$



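Routine calculations (see also Figure 2) yield that for $c > c_{\mathrm{collapse}}$ there
are exactly two roots in $(0, 1)$ to $1 - f_c(t) = t$. In this range the above
recurrence converges to the larger of these two roots. As Equation (8) shows,
$E(s) > 0$ for large $k_*$ and $n$ iff this limit $\beta$ satisfies

$$-\beta + c\beta^d(1 - \beta) + \frac{c}{d+1}\beta^{d+1} > 0.$$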
In the statement of Theorem 1.1 the same calculations are done "in
reverse". Namely, Equation (2) states that $1 > \beta > 0$ is a root of $1 - f_{c_d^*}(t) = t$.
It only remains to rule out the possibility that Equation (1) yields the smaller
root. To this end, note that the solution of Equation (1) is $\beta > 1 - \exp(-d)$.
Moreover, the larger/smaller root of $1 - f_c(t) = t$ increases resp. decreases
with $c$ and the smaller root is smaller than $1 - \exp(-d)$ for every $c > c_{\mathrm{collapse}}$.
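The behavior of the recurrence and the sign of the main term of $E[s]$ can be explored numerically. The sketch below iterates $t \to 1 - \exp(-c\,t^d)$ from $t = 1$ and then evaluates the expression $-\beta + c\beta^d(1-\beta) + \frac{c}{d+1}\beta^{d+1}$ at the limit; reading this expression off Equation (8) is our assumption, and the function names and the fixed iteration count are illustrative (convergence is slow for $c$ very close to $c_{\mathrm{collapse}}$).

```python
import math

def limit_beta(d, c, iterations=2000):
    """Iterate t -> 1 - exp(-c * t^d), starting from t = 1."""
    t = 1.0
    for _ in range(iterations):
        t = 1.0 - math.exp(-c * t ** d)
    return t

def expected_s_density(d, c):
    """Main term of E[s] / binom(n, d), evaluated at the limit of the recurrence."""
    b = limit_beta(d, c)
    return -b + c * b ** d * (1.0 - b) + c * b ** (d + 1) / (d + 1)

for c in (2.0, 2.7, 2.8, 3.0):
    print(c, limit_beta(2, c), expected_s_density(2, c))
```

For $d = 2$ the limit is zero at $c = 2$, and the evaluated expression changes sign between $c = 2.7$ and $c = 2.8$, in line with the value $c_2^* = 2.75381$.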



Figure 2: The function $1 - f(t)$ for $c < c_{\mathrm{collapse}}$ and for $c > c_{\mathrm{collapse}}$

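Let $k_*$ be large enough s.t. $\beta_{k_*-1} - \beta_* < \epsilon$, where $\beta_*$ denotes the limit
of the recurrence. Since $c > c_d^*$ we conclude that
for $n$ large enough

$$E[s] > \epsilon'\binom{n}{d} \tag{9}$$

where $\epsilon' > 0$ depends only on $c$ and $d$.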



4.2 A concentration of measure argument 

The only missing part of the proof is that $s > 0$ almost surely. This is shown
using the following version of Azuma's inequality from [8].

Theorem 4.2. Let $Y_1, \ldots, Y_m$ be independent random variables taking values from $\{0, 1\}$.
Suppose $\Phi : \{0, 1\}^m \to \mathbb{R}$ satisfies:

$$|\Phi(x) - \Phi(x')| \le c_k$$

for $x$ and $x'$ that differ only at their $k$-th coordinate. Then for every $t > 0$:

$$\Pr\left[|\Phi(Y_1, \ldots, Y_m) - E[\Phi(Y_1, \ldots, Y_m)]| > t\right] \le 2\exp\left(-\frac{2t^2}{\sum_{k=1}^{m} c_k^2}\right). \tag{10}$$

Let $\sigma_1, \ldots, \sigma_{\binom{n}{d+1}}$ be the list of all $d$-faces, and let $Y_i$ be the indicator
random variable of the event $\sigma_i \in X$. We apply this theorem with $m = \binom{n}{d+1}$
and with $\Phi = s$. Let us consider two $d$-complexes $X$ and $X'$ that are identical,
except that $\sigma \in X$, but $\sigma \notin X'$. We need to provide an upper bound on
$|s(X) - s(X')|$. In an elementary collapse we eliminate one $d$-face and one
$(d-1)$-face. Thus, in particular $f_d(R_i(X)) - f_{d-1}(R_i(X)) = f_d(X) - f_{d-1}(X)$
for every $i$. Clearly $f_d(X) - f_d(X') = 1$ and $f_{d-1}(X) = f_{d-1}(X')$, so we only
need a bound on $|\zeta_*(X') - \zeta_*(X)|$. As we show

$$(d+1) \cdot d^{k_*} \ge |\zeta_*(X') - \zeta_*(X)|.$$

We now compare the collapsing processes as they evolve in $X$ and in $X'$.
If a $(d-1)$-face is of generation $i, i' \le k_*$ in $X, X'$ respectively, then clearly
$i \ge i'$. Let $G$ be the set of $(d-1)$-faces in $X$ for which $i > i'$. Clearly $G$ is
contained in the $k_*$-neighborhood of $\sigma$. We classify the faces in $G$ according
to their distance from $\sigma$ and show that at distance $j$ from $\sigma$ there are at most
$(d+1) \cdot d^{j-1}$ members of $G$. This is clearly true for $j = 1$, namely the $d + 1$
subfaces of $\sigma$. The general claim is shown by induction on $j$. A $(d-1)$-face
$\tau \in G$ at distance $j + 1$ from $\sigma$ must have a neighbor (in $G_X$), a $(d-1)$-face
$\tau' \in G$ whose distance from $\sigma$ is $j$ so that $\tau'$ is of generation one earlier than
$\tau$. In particular, there is a $d$-face $\varphi$ that contains both $\tau$ and $\tau'$ and is
collapsed through $\tau'$.

So let $\tau' \in G$ be a face of generation $i'$ in $X'$ whose distance from $\sigma$ is $j$.
Since $\tau'$ can collapse only one $d$-face, it can have at most $d$ neighbors in $G$
which are $(d-1)$-faces at distance $j + 1$ from $\sigma$. We can conclude that there
are at most $(d+1)d^j$ faces in $G$ at distance $j + 1$ from $\sigma$.

Clearly $|\zeta_*(X') - \zeta_*(X)| \le |G|$, and $|G| \le \sum_{i=1}^{k_*}(d+1)d^{i-1} < (d+1)d^{k_*}$
by the previous discussion. Thus $|s(X) - s(X')| \le (d+1)^{k_*+1}$. By using (9)
and (10) we conclude:

$$\Pr[s \le 0] \le 2\exp\left(-\frac{2\epsilon'^2\binom{n}{d}^2}{\binom{n}{d+1}(d+1)^{2(k_*+1)}}\right) = o(1).$$

□
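For the reader's convenience, here is a short calculation of the order of magnitude of the exponent in the last bound. It assumes the bounded-differences inequality in the form $2\exp(-2t^2/\sum_k c_k^2)$, applied with $t = \epsilon'\binom{n}{d}$, $m = \binom{n}{d+1}$ and $c_k \le (d+1)^{k_*+1}$, and uses the fact that $s \le 0$ together with $E[s] > t$ forces $|s - E[s]| > t$.

```latex
% Uses \binom{n}{d+1} = \binom{n}{d}\,\frac{n-d}{d+1}; d, k_* and \epsilon' are fixed.
\[
\frac{2t^2}{\sum_{k=1}^{m} c_k^2}
\;\ge\; \frac{2\epsilon'^2 \binom{n}{d}^2}{\binom{n}{d+1}\,(d+1)^{2(k_*+1)}}
\;=\; \frac{2\epsilon'^2\,(d+1)\binom{n}{d}}{(n-d)\,(d+1)^{2(k_*+1)}}
\;=\; \Theta\!\left(n^{d-1}\right).
\]
```

Since $d \ge 2$, the exponent tends to infinity with $n$, so the probability bound is indeed $o(1)$.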



5 Open problems 

• The most obvious remaining challenge is to determine the correct
threshold for the non-vanishing of the $d$-th homology in $X_d(n,p)$. As stated
before we believe $c_d^*$ is this threshold.

• The present results and those of [1] strongly suggest that the threshold
for collapsibility is substantially smaller than the one for the almost
sure non-vanishing of the $d$-th homology. Can one at least show that
the two thresholds do not coincide?

• In the random complex process, what is the distribution of the first
emerging cycle in $H_d(X)$? In particular, can one prove that (as
suggested by our numerical experiments) it is either $\partial\Delta_{d+1}$ or else it
includes all $n$ vertices?

• It would be extremely interesting to investigate the inclusion matrices
of $(d-1)$-faces vs. $d$-faces of complexes in $X_d(n, \frac{c}{n})$ for values of $c$
between the two thresholds (assuming, of course, that they differ). If
the conjectures alluded to in the above questions hold, then this matrix
has excellent properties, when viewed as the parity-check matrix of an
error-correcting code.